Course Information
Course Name
Machine Learning (機器學習)
Semester
112-1
Intended Audience
Graduate School of Advanced Technology, Ph.D. Program in Nanoengineering and Nanoscience
Instructor
舒貽忠
Course Number
AM7192
Course Identifier
543 M1180
Section

Credits
3.0
Full/Half Year
Half year (one semester)
Required/Elective
Elective
Class Time
Wednesday, period 6 (13:20~14:10); Friday, periods 7 and 8 (14:20~16:20)
Classroom
應113
Remarks
Enrollment limit: 60 students
 
Course Syllabus
Course Description

The course provides a comprehensive introduction to machine learning, with a primary emphasis on the fundamental principles governing learning algorithms. It covers a wide range of topics, including: (1) Supervised Learning: generative and discriminative probabilistic classifiers (Bayes and logistic regression), least-squares regression, and neural networks (convolutional neural networks, recurrent neural networks); (2) Probabilistic Graphical Models: the Hidden Markov Model (HMM); (3) Basic Learning Theory: PAC learning and model selection. This course aims to provide students with a robust foundation essential for conducting research in machine learning.

Course Objectives
Upon completion, students will be proficient in applying calculus, linear algebra, optimization, probability, and statistics to build learning models for diverse real-world problems. Moreover, they will be well prepared for advanced research in machine learning and related domains.
Course Requirements
The course is taught in Chinese, using the blackboard to present and explain the mathematical principles of machine learning algorithms.
Expected Weekly Study Hours Outside of Class
 
Office Hours
By appointment
Required Reading
To be announced
References
1. C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
2. S. Shalev-Shwartz and S. Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.
3. O. Calin. Deep Learning Architectures: A Mathematical Approach. Springer, 2020.
4. K. P. Murphy. Probabilistic Machine Learning: An Introduction. MIT Press, 2022.
5. Y. S. Abu-Mostafa, M. Magdon-Ismail, and H.-T. Lin. Learning From Data. AMLbook, 2012.
6. E. Alpaydin. Introduction to Machine Learning. MIT Press, 2020.
Grading
(For reference only)
Course Schedule
Week 1 (9/06, 9/08): Mathematical formulation of a learning problem, evaluation of a model (loss function), generalization error, Empirical Risk Minimization (ERM), ERM with inductive bias, the Bayes optimal classifier
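
To make ERM concrete, here is a minimal Python sketch (illustrative only, not part of the official course materials): it minimizes the empirical 0-1 risk over a finite hypothesis class of one-dimensional threshold classifiers, on synthetic data with label noise.

# A minimal sketch of Empirical Risk Minimization (ERM) with 0-1 loss,
# over a finite hypothesis class of threshold classifiers h_t(x) = sign(x - t).
# All data below are synthetic illustrations, not course data.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)               # 1-D inputs
y = np.where(x > 0.3, 1, -1)                   # labels from true threshold 0.3
y[rng.random(100) < 0.1] *= -1                 # 10% label noise

thresholds = np.linspace(-1, 1, 201)           # the finite hypothesis class

def empirical_risk(t):
    preds = np.where(x > t, 1, -1)
    return np.mean(preds != y)                 # 0-1 loss averaged over the sample

risks = np.array([empirical_risk(t) for t in thresholds])
t_erm = thresholds[np.argmin(risks)]           # ERM returns the empirical minimizer
print(f"ERM threshold: {t_erm:.2f}, empirical risk: {risks.min():.2f}")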
Week 2 (9/13, 9/15): Examples of Bayes optimal classifiers, polynomial threshold functions, overfitting, generalization/empirical error vs. model complexity
Week 3 (9/20, 9/22): An example illustrating the No-Free-Lunch Theorem; the Perceptron Learning Algorithm (PLA) for linearly separable data
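
A minimal sketch of PLA on synthetic linearly separable data (an illustration, not the instructor's code); since the data are separable, the perceptron convergence theorem guarantees the loop terminates.

# Perceptron Learning Algorithm on synthetic separable data.
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(50, 2))
w_true = np.array([1.0, -2.0, 0.5])            # ground-truth separator (bias first)
Xb = np.hstack([np.ones((50, 1)), X])          # prepend x0 = 1 for the bias term
y = np.sign(Xb @ w_true)                       # separable labels by construction

w = np.zeros(3)
while True:
    wrong = np.where(np.sign(Xb @ w) != y)[0]  # indices of misclassified points
    if wrong.size == 0:                        # all points correct: stop
        break
    i = wrong[0]                               # pick one misclassified point
    w = w + y[i] * Xb[i]                       # the PLA update rule
print("learned weights:", w)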
Week 4 (9/27, 9/29): Mean, standard deviation, the Bernoulli distribution, worked examples of Bayes' theorem
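
As one worked example of Bayes' theorem, the short sketch below uses made-up numbers for a diagnostic test (the prior, sensitivity, and specificity are all hypothetical) to compute a posterior probability.

# Bayes' theorem: P(D | +) = P(+ | D) P(D) / P(+), with illustrative numbers.
p_d = 0.01          # prior: P(disease)
sens = 0.95         # sensitivity: P(test + | disease)
spec = 0.90         # specificity: P(test - | no disease)

p_pos = sens * p_d + (1 - spec) * (1 - p_d)     # law of total probability for P(+)
p_d_given_pos = sens * p_d / p_pos              # Bayes' theorem
print(f"P(disease | +) = {p_d_given_pos:.3f}")  # ~0.088 despite 95% sensitivity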
Week 5 (10/04, 10/06): Naive Bayes classifier based on the Bernoulli distribution, Maximum Likelihood Estimation (MLE), the algorithm, and an example (classification of handwritten digits)
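
A minimal sketch of a Bernoulli naive Bayes classifier fitted by MLE (with add-one smoothing, an assumption on my part) on tiny synthetic binary vectors; the in-class handwritten-digit example would treat binarized pixels the same way.

# Bernoulli naive Bayes: MLE of per-class feature probabilities, then
# classification by maximum posterior log-probability.
import numpy as np

def fit_bernoulli_nb(X, y, alpha=1.0):
    classes = np.unique(y)
    priors = np.array([(y == c).mean() for c in classes])
    # smoothed MLE of P(x_j = 1 | class c)
    theta = np.array([(X[y == c].sum(0) + alpha) / ((y == c).sum() + 2 * alpha)
                      for c in classes])
    return classes, priors, theta

def predict(X, classes, priors, theta):
    # log P(c) + sum_j log P(x_j | c), evaluated for every class
    log_lik = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
    return classes[np.argmax(np.log(priors) + log_lik, axis=1)]

X = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]])  # toy binary features
y = np.array([0, 0, 1, 1])
classes, priors, theta = fit_bernoulli_nb(X, y)
print(predict(X, classes, priors, theta))       # recovers [0 0 1 1]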
Week 6 (10/11, 10/13): Naive Bayes classifier based on the Gaussian distribution, Maximum Likelihood Estimation (MLE), the algorithm, and the decision boundary
Week 7 (10/18, 10/20): Confusion matrix, ROC curve, and a discriminative probabilistic model (logistic regression)
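
A minimal sketch computing a confusion matrix and tracing ROC points by sweeping the decision threshold; the labels and scores are made up purely for illustration.

# Confusion matrix and ROC points from thresholded classifier scores.
import numpy as np

y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])
scores = np.array([.9, .8, .7, .6, .5, .4, .3, .2])   # hypothetical scores

def confusion(y_true, y_pred):
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    return tp, fp, fn, tn

# Sweeping the threshold traces out points on the ROC curve.
for t in [.85, .65, .45, .25]:
    tp, fp, fn, tn = confusion(y_true, (scores >= t).astype(int))
    tpr, fpr = tp / (tp + fn), fp / (fp + tn)
    print(f"threshold {t:.2f}: TPR={tpr:.2f}, FPR={fpr:.2f}")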
Week 8 (10/25, 10/27): Logistic regression (sentiment-analysis example), comparison between generative and discriminative models, MLE for learning the parameters
Week 9 (11/01, 11/03): Optimization, gradient descent, an example from logistic regression (in-class coding), stochastic gradient descent, comparison with PLA, and nonlinear classifiers via nonlinear feature transformations
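
Since this week features in-class coding for logistic regression, here is one possible minimal sketch (my assumption of what such code might look like, not the official version): batch gradient descent on the mean negative log-likelihood; replacing the full gradient with the gradient at one random example per step yields SGD.

# Batch gradient descent for logistic regression (MLE) on synthetic data.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
Xb = np.hstack([np.ones((200, 1)), X])            # bias column
y = (Xb @ np.array([0.5, 2.0, -1.0]) + rng.normal(0, .5, 200) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, lr = np.zeros(3), 0.1
for _ in range(500):
    grad = Xb.T @ (sigmoid(Xb @ w) - y) / len(y)  # gradient of the mean NLL
    w -= lr * grad                                # descend along -gradient
print("weights:", w, " accuracy:", np.mean((sigmoid(Xb @ w) > .5) == y))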
Week 10 (11/08, 11/10): Neural networks: the abstract neuron; the AND, OR, and XOR problems; the multi-layer perceptron (MLP) and its mathematical definition; revisiting XOR via Boolean operations
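
A minimal sketch of the Boolean view of XOR: a two-layer network of hard-threshold neurons with hand-chosen (not learned) weights computes XOR(x1, x2) = AND(OR(x1, x2), NAND(x1, x2)), even though no single linear unit can.

# XOR via a two-layer network of threshold neurons with fixed weights.
import numpy as np

step = lambda z: (z >= 0).astype(int)            # hard-threshold activation

def xor(x):
    h1 = step(x @ np.array([1, 1]) - 0.5)        # hidden unit 1: OR
    h2 = step(x @ np.array([-1, -1]) + 1.5)      # hidden unit 2: NAND
    return step(h1 + h2 - 1.5)                   # output unit: AND of h1, h2

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(xor(X))                                    # -> [0 1 1 0]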
Week 11 (11/15, 11/17): Neural networks: why direct computation of the gradient of the loss function with respect to each weight is inefficient; introduction and derivation of the backpropagation algorithm
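
A minimal sketch of backpropagation for a one-hidden-layer sigmoid MLP with squared loss, trained on XOR (the architecture, learning rate, and iteration count are illustrative choices): gradients are propagated backward layer by layer via the chain rule, rather than differentiating the loss with respect to each weight directly.

# Backpropagation for a 2-4-1 sigmoid MLP on the XOR data.
import numpy as np

rng = np.random.default_rng(3)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([[0], [1], [1], [0]], float)

W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)
sig = lambda z: 1 / (1 + np.exp(-z))

for _ in range(10000):
    # forward pass
    h = sig(X @ W1 + b1)
    out = sig(h @ W2 + b2)
    # backward pass: chain rule, output layer first
    d_out = (out - y) * out * (1 - out)          # dLoss/d(output pre-activation)
    d_h = (d_out @ W2.T) * h * (1 - h)           # propagate to the hidden layer
    W2 -= 0.5 * h.T @ d_out;  b2 -= 0.5 * d_out.sum(0)
    W1 -= 0.5 * X.T @ d_h;    b1 -= 0.5 * d_h.sum(0)

print(out.round(2).ravel())                      # typically approaches [0 1 1 0]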
Week 12 (11/22, 11/24): Convolutional Neural Networks (CNNs): convolution of 1-D and 2-D signals, cross-correlation, convolution layers vs. fully connected layers, and the characteristics of CNNs: sparse connectivity, weight sharing, and the receptive field
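
A minimal sketch contrasting 1-D convolution with cross-correlation on a toy signal: convolution flips the kernel, while cross-correlation (what CNN layers actually compute) uses it as-is.

# Convolution vs. cross-correlation of a 1-D signal with a small kernel.
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
k = np.array([1., 0., -1.])

conv  = np.convolve(x, k, mode='valid')          # kernel is flipped
xcorr = np.correlate(x, k, mode='valid')         # kernel is used as-is
print("convolution:      ", conv)                # [ 2.  2.  2.]
print("cross-correlation:", xcorr)               # [-2. -2. -2.]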
Week 13 (11/29, 12/01): Why convolution? An example: the Sobel operator for edge detection; CNN architecture
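
A minimal sketch of the Sobel example: sliding the horizontal-gradient kernel over a small synthetic image (a hypothetical 6x6 array with one vertical edge) shows how a fixed convolution kernel responds only at edges.

# Sobel edge detection by explicit 2-D sliding-window correlation.
import numpy as np

img = np.zeros((6, 6))
img[:, 3:] = 1.0                                 # vertical edge at column 3

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], float)

H, W = img.shape
out = np.zeros((H - 2, W - 2))
for i in range(H - 2):                           # slide the 3x3 window
    for j in range(W - 2):
        out[i, j] = np.sum(img[i:i+3, j:j+3] * sobel_x)
print(out)                                       # nonzero only near the edge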
Week 14 (12/06, 12/08): Pooling, CNN Explainer, and backpropagation in CNNs (derivation)
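
A minimal sketch of 2x2 max pooling with stride 2, the downsampling step visualized in tools like CNN Explainer; the input array is arbitrary.

# 2x2 max pooling with stride 2 via a block reshape.
import numpy as np

x = np.arange(16, dtype=float).reshape(4, 4)
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))  # maximum within each 2x2 block
print(pooled)                                    # [[ 5.  7.] [13. 15.]]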
Week 15 (12/13, 12/15): Information entropy, Shannon's source coding theorem, the cross-entropy loss, and the Kullback-Leibler divergence
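
A minimal sketch numerically verifying the identity H(p, q) = H(p) + KL(p || q) for two made-up discrete distributions, which is why minimizing cross-entropy with respect to q is equivalent to minimizing KL divergence.

# Entropy, cross-entropy, and KL divergence for discrete distributions.
import numpy as np

p = np.array([0.5, 0.25, 0.25])
q = np.array([0.4, 0.4, 0.2])

H_p   = -np.sum(p * np.log2(p))                  # entropy of p
H_pq  = -np.sum(p * np.log2(q))                  # cross-entropy of p w.r.t. q
KL_pq =  np.sum(p * np.log2(p / q))              # KL divergence from q to p
print(H_p, H_pq, KL_pq)
print(np.isclose(H_pq, H_p + KL_pq))             # -> True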
Week 16 (12/20, 12/22): Final Exam